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Abstract 



The use of probabilistic models based on copulas in EDAs (Estimation of Distribution 
Algorithms) is currently an active area of research. In this context, the copulaedas R 
package intends to provide a platform where EDAs based on copulas can be implemented 
and studied. The package offers complete implementations of various EDAs based on 
copulas and vines, a group of well-known benchmark problems, utility functions to study 
the behavior of EDAs and the possibility of implementing new algorithms that can be in- 
tegrated into the package. EDAs are implemented using S4 classes with generic functions 
for its main parts: seeding, selection, learning, sampling, replacement, local optimization, 
termination, and reporting. This paper provides an overview of EDAs based on copulas, 
describes the implementation of copulaedas and illustrates its use with examples. The 
examples include running the EDAs implemented in the package, implementing new algo- 
rithms and performing an empirical study to compare the behavior of a group of EDAs. 

Keywords: optimization, estimation of distribution algorithms, copula, vine, R. 



EDAs (Estimation of Distribution Algorithms) (Muhlenbein and Paafi 1996; Baluja 1994; 
Larrahaga and Lozano 2002; Pelikan et al. 2002) are evolutionary optimization methods char- 
acterized by the explicit use of probabilistic models. These algorithms explore the search 
space by iteratively estimating and sampling a probability distribution built from promising 
solutions. 

Due to its tractable properties, the normal distribution has been commonly used to model 
the search distributions of EDAs for real- valued optimization problems (Bosman and Thierens 
2006; Kern et al. 2003). Nevertheless, its use is often inconsistent with the empirical evidence 
and leads to the construction of incorrect models. Copula functions (Joe 1997; Nelsen 2006) 
offer an alternative to tackle these problems. By means of Sklar's Theorem (Sklar 1959), any 
multivariate distribution can be decomposed into marginal distributions and a copula that 
determines the dependence structure between the variables. EDAs based on copulas inherit 
these properties and consequently can build more realistic search distributions. 

Although several EDAs based on copulas have been proposed in the literature, there are no 
publicly available implementations of such algorithms. Aiming to fill this gap, the copulaedas 
package (Gonzalez-Fernandez and Soto 2011a) for the R language and environment for statis- 
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i <- 1 
repeat 

if i == 1 then 

Generate an initial population P\ using a seeding method. 

Evaluate the solutions in the population Pi. 

If required, apply a local optimization method to the population P\. 
else 

Select a population pp elected from Pj_i according to a selection method. 
Learn a probabilistic model Mj from p? elected using a learning method. 
Sample a new population p^ am P led f r0 m Mj using a sampling method. 
Evaluate the solutions in the population p am P led m 

If required, apply a local optimization method to the population p^ am P led _ 
Create the population Pj from Pj_i and p^ am P led us i n g a replacement method. 
end if 

If required, report progress information using a reporting method, 
i «- i + 1 

until A criterion of the termination method is met. 



Algorithm 1: General procedure of an EDA. 



tical computing (R Development Core Team 2011) has been published on the Comprehensive 
R Archive Network at http : //CRAN .R- project . org/package=copulaedas. 

The copulaedas package intends to provide a platform where EDAs based on copulas can be 
implemented and studied. It contains implementations of various EDAs based on copulas, a 
group of well-known benchmark problems and utility functions to study EDAs. 

The rest of this paper is organized as follows. Section 2 presents the necessary background on 
EDAs based on copulas. Section 3 describes the details of the implementation of copulaedas 
while Section 4 illustrates its use with examples. Finally, concluding remarks are given in 
Section 5. 



2. Estimation of distribution algorithms based on copulas 

This section begins by describing the general procedure of an EDA, according to the imple- 
mentation included in copulaedas. Then, a general overview of the EDAs based on copulas 
presented in the literature with emphasis on the algorithms implemented in the package is 
given. 

2.1. General procedure of an EDA 

The general procedure of an EDA is outlined in Algorithm 1 . Each iteration of this procedure 
is referred to as one generation of the EDA. The main steps of the algorithm are highlighted 
in italics. 

The first step of an EDA is the generation of an initial population of solutions. The initial 
population is usually generated randomly but it can be generated using a particular heuristic 
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when a priori information about the characteristics of the optimal solutions is available. In 
both cases, we call the method used to generate the initial population a seeding method. 

The results of global optimization algorithms such as EDAs can often be improved if combined 
with local optimization methods that look for better solutions in the neighborhood of each 
solution. Local optimization methods can also be used to implement repairing methods for 
constrained problems where the simulated solutions may be unfeasible and some strategy to 
repair these solutions is available. 

A selection method is used to determine the solutions to be modeled by the search distri- 
bution. These solutions are usually the most promising solutions of the population. An 
example selection method is truncation selection, which creates the selected population with 
a percentage of the best solutions of the current population. 

The estimation and simulation of the search distribution are essential steps of an EDA. These 
steps are implemented by learning and sampling methods, respectively. Both methods are 
closely related. Learning methods estimate structure and parameters of the probabilistic 
model used by the algorithm from the selected population, while sampling methods are used 
to generate a new population of solutions from the learned probabilistic model. 

A replacement method is used to incorporate a new population of solutions into the current 
population. An example replacement strategy is to replace completely the current population 
with the new population. Other replacement strategies retain the best solutions found so far 
or are used to maintain the diversity of solutions. 

Reporting methods provide progress information during the execution of the EDA. Relevant 
progress information can be the number of evaluations of the objective function and the best 
solution found so far. 

Finally, a termination method determines when the algorithm stops according to certain 
criteria; for example, a fixed number of function evaluations are realized or a certain value of 
the objective function is reached. 

We are particularly interested in EDAs whose learning and sampling steps involve probabilistic 
models based on copulas. The next section provides an overview of such algorithms. 

2.2. Overview of EDAs based on copulas 

To the best of our knowledge, the technical report (Soto et al. 2007) and the theses (Arderi 
2007; Barba-Moreno 2007) were the first attempts to incorporate copulas into EDAs. Since 
then, a considerable number of EDAs based on copula theory have been proposed in the litera- 
ture (e.g., Wang et al. 2009b, a; Salinas-Gutierrez et al. 2009; Gao 2009; Soto and Gonzalez- Fernandez 
2010; Salinas-Gutierrez et al. 2010; Cuesta-Infante et al. 2010; Ye et al. 2010; Gonzalez-Fernandez 
2011). As evidence of its increasing popularity, the use of copulas in EDAs has been identified 
as an emerging approach for the solution of real- valued optimization problems (Hauschild and Pelikan 
2011). 

In general, the learning step of copula-based EDAs consist of two parts: the estimation of the 
marginal distributions and the estimation of the probabilistic dependence structure. Usually, 
a particular distribution (e.g., normal or beta) is assumed for each margin and its parame- 
ters are estimated by maximum likelihood (Soto et al. 2007; Arderi 2007; Wang et al. 2009b, a; 
Salinas-Gutierrez et al. 2009; Soto and Gonzalez-Fernandez 2010; Salinas-Gutierrez et al. 2010; 
Ye et al. 2010; Gonzalez-Fernandez 2011; Salinas-Gutierrez et al. 2011). In other cases, ker- 



4 



copulaedas: An R Package for EDAs Based on Copulas 



nel density estimation (Soto et al. 2007; Arderi 2007; Gao 2009; Salinas-Gutierrez et al. 2011) 
or empirical marginal distributions (Cuesta-Infante et al. 2010) have been used. Once the 
marginal distributions are estimated, the selected population is transformed into uniform 
variables in (0, 1) by the evaluation of each marginal cumulative distribution function. This 
transformed population is then used to estimate a copula-based model of the dependence 
structure among variables. 

The simulation step usually starts with the generation of a population of uniform variables 
in (0, 1) with the dependence structure described by the copula-based model estimated in the 
learning step. Finally, this uniform population is transformed to the domain of the variables 
through the evaluation of the inverse of each marginal cumulative distribution function. 

According to the copula model used, EDAs based on copulas can be classified as EDAs based 
on either multivariate or factorized copulas. In the rest of this section we describe algorithms 
belonging to each group. 

EDAs based on multivariate copulas 

The research on EDAs based on multivariate copulas has focused on the use of the normal cop- 
ula (Soto et al. 2007; Arderi 2007; Barba-Moreno 2007; Wang et al. 2009b) and Archimedean 
copulas (Wang et al. 2009a; Gao 2009). 

The algorithms described in (Soto et al. 2007; Arderi 2007; Barba-Moreno 2007) are theoret- 
ically similar but they present differences in the estimation of the marginal distributions and 
the use of techniques such as variance scaling. Wang et al. (2009b) present the bivariate case 
and, since only normal marginal distributions are used, the proposed algorithm is equivalent 
to EMNA (Estimation of Multivariate Normal Algorithm) (Larrahaga et al. 2001). 

The algorithms presented in (Wang et al. 2009a; Gao 2009) use exchangeable Archimedean 
copulas. Wang et al. (2009a) propose two algorithms that use Clayton and Ali-Mikhail-Haq 
copulas, respectively. In this work, the parameters of the copulas are not estimated from the 
selected population. Gao (2009) does not state which members of the family of Archimedean 
copulas are used in the algorithm. 

Two EDAs based on multivariate copulas are implemented in copulaedas, one is based on the 
product or independence copula and the other on the normal copula. 

The first algorithm is UMDA (Univariate Marginal Distribution Algorithm) for continuous 
variables (Larrahaga et al. 1999, 2000). UMDA can be integrated into the framework of EDAs 
based on copulas, although originally it was not defined in terms of copulas. A consequence 
of Sklar's Theorem is that random variables are independent if and only if the underlying 
copula is the product copula. Thus, UMDA can be described as an EDA based on copulas 
that models the dependence structure between the variables using a multivariate product 
copula. 

The second EDA based on a multivariate copula implemented in copulaedas is GCEDA (Gaus- 
sian Copula Estimation of Distribution Algorithm) (Soto et al. 2007; Arderi 2007). This algo- 
rithm is based on the multivariate normal copula, which allows the construction of multivariate 
distributions with normal dependence structure and non-normal margins. The dependence 
structure of the multivariate normal copula is determined by a positive-definite correlation 
matrix. If the marginal distributions are not normal, the correlation matrix is estimated by 
the inversion of the non-parametric estimator of Kendall's tau for each pair of variables (see 
e.g., Nelsen 2006). If the resulting matrix is not positive-definite, the correction proposed by 
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Rousseeuw and Molenberghs (1993) can be applied. GCEDA is equivalent to EMNA when 
all marginal distributions are normal. 

The implementation of UMDA and GCEDA provided by copulaedas has been used for the 
solution of a real- world problem known as the molecular docking problem (Soto et al. 2012). 

EDAs based on copula factorizations 

The use of multivariate copulas to model the dependence structure between variables offers 
several advantages over the use of a multivariate normal distribution; nevertheless, it presents 
limitations. The number of tractable copulas when more than two variables are involved 
is limited. In fact, most of the available parametric copulas are bivariate. Moreover, the 
multivariate copulas are not appropriate when all pairs of variables do not have the same 
dependence structure. Another limitation is that multivariate extensions, such as exchange- 
able Archimedean copulas or the multivariate t copula, have only one parameter to describe 
certain aspects of the overall dependence. This can be a serious issue when there are pairs of 
variables with different patterns of dependence. 

One alternative is to use copula factorizations that build high-dimensional probabilistic models 
by using lower-dimensional copulas as building blocks. Several EDAs based on copula factor- 
izations have been proposed in the literature (Salinas-Gutierrez et al. 2009; Soto and Gonzalez-Fernandez 
2010; Salinas-Gutierrez et al. 2010; Cuesta-Infante et al. 2010; Ye et al. 2010; Gonzalez-Fernandez 
2011), although the authors are not always aware of the limitations of the multivariate copula 
approach and present the algorithms without comments about these issues. 

The EDA introduced in (Salinas-Gutierrez et al. 2009) is an extension of MIMIC (Mutual In- 
formation Maximization for Input Clustering) for continuous domains (Larrahaga et al. 1999, 
2000) that uses bivariate copulas in a chain structure instead of bivariate normal distribu- 
tions. Two instances of this algorithm were presented, one uses normal copulas and the other 
Frank copulas. This algorithm is described in more detail in Section 4.2, where we illustrate 
the implementation of a copula-based EDA by using copulaedas. 

The exchangeable Archimedean copulas employed in (Wang et al. 2009a; Gao 2009) repre- 
sent highly specialized dependence structures (Berg and Aas 2007; McNeil 2008). Nested 
Archimedean copulas provide a more flexible way to build multivariate Archimedean copulas. 
Among the different nesting structures that have been proposed in the literature (see e.g., 
Berg and Aas 2007 for a review), hierarchically nested Archimedean copulas present one of 
the most flexible structures. Ye et al. (2010) propose an EDA that uses a representation of 
hierarchically nested Archimedean copulas based on Levy subordinators (Hering et al. 2010). 

Cuesta-Infante et al. (2010) investigate the use of bivariate empirical copulas and a multivari- 
ate extension of Archimedean copulas. The EDA based on bivariate empirical copulas is com- 
pletely nonparametric: it employs empirical marginal distributions and a construction based 
on bivariate empirical copulas to represent the dependence among variables. The marginal 
distributions and the bivariate empirical copulas are defined through the linear interpolation 
of the sample in the selected population. The EDA based on Archimedean copulas uses a 
construction similar to a fully nested Archimedean copula. This algorithm uses copulas from 
one of the families Frank, Clayton or HRT (i.e., heavy right tail copula or Clayton survival 
copula). The parameters of the copulas are not estimated from the selected population but 
fixed to a constant value. The marginal distributions are modeled as in the EDA based on 
bivariate empirical copulas. 
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The class of VEDAs (Vine EDAs) was introduced in (Soto and Gonzalez-Fernandez 2010; 
Gonzalez- Fernandez 2011). Algorithms of this class model the search distributions using 
vines (Joe 1996; Bedford and Cooke 2001; Aas et al. 2009), which are graphical models that 
represent high-dimensional distributions by decomposing the multivariate density into bivari- 
ate copulas and one-dimensional densities. A vine on n variables is a set of nested trees 
Xi, . . . , T n _i, where the edges of tree Tj are the nodes of the tree Tj+i with j = 1, . . . ,n — 2. 
The edges of the trees represent the bivariate copulas in the decomposition. Since all bivariate 
copulas do not have to belong to the same family, vines model a rich variety of dependences 
by combining bivariate copulas from different families. 

C-vines (canonical vines) and D-vines (drawable vines) are two types of vines, each of which 
determine a specific decomposition of the multivariate density. In a C-vine, each tree Tj has 
a unique root node that is connected to n — j edges. In a D-vine, no node is connected to more 
than two edges. Two EDAs based on these models were presented in (Soto and Gonzalez-Fernandez 
2010; Gonzalez-Fernandez 2011): CVEDA (C-Vine EDA) and DVEDA (D-Vine EDA) based 
on C-vines and D-vines, respectively. Since both algorithms are implemented in copulaedas, 
we describe them in more detail in the rest of this section. 

The general idea of inference and simulation methods for C-vines and D-vines was developed 
by Aas et al. (2009). The inference algorithm should consider two main aspects: the selection 
of the structure of the vines and the choice of the bivariate copulas in the factorization. The 
simulation algorithm is based on the conditional distribution method (see e.g., Devroye 1986). 

At each generation of CVEDA and DVEDA, the selection of the structure of the vine involves 
selecting the bivariate dependences that will be explicitly modeled in the first tree. This is 
accomplished by using greedy heuristics based on the empirical Kendall's tau between each 
pair of variables in the selected population assigned to the edges of the tree. In a C-vine, the 
node that maximizes the sum of the weights of its edges to the other nodes is chosen as the 
root node of the first tree. In a D-vine, the problem of constructing the first tree consists 
of finding the maximum weighted sequence of the variables. Brechmann (2010) transforms 
this problem into a TSP (Traveling Salesman Problem) instance. For efficiency reasons, 
in copulaedas we find an approximate solution of the TSP by using the cheapest insertion 
heuristic (Rosenkrantz et al. 1977). 

The selection of each bivariate copula in both decompositions starts with an independence 
test (Genest and Remillard 2004; Genest et al. 2007). The product copula is selected if there 
is not enough evidence against the null hypothesis of independence at a given significance 
level. In the other case, the parameters of a group of candidate copulas are estimated and the 
copula that minimizes a Cramer-von Mises statistic based on the empirical copula is selected 
(Genest and Remillard 2008). 

The cost of the construction of C-vines and D-vines increases with the number of variables. 
To simplify the construction we apply the truncation strategy presented in (Brechmann 2010). 
If a vine is truncated at a given tree, all the copulas in the subsequent trees are assumed to 
be product copulas. A model selection procedure based on either AIC (Akaike Information 
Criterion) (Akaike 1974) or BIC (Bayesian Information Criterion) (Schwarz 1978) is applied 
to detect the required number of trees. This procedure expands the tree Tj + \ if the value of 
the information criterion calculated up to the tree Ij+i is smaller than the value obtained up 
to the previous tree; otherwise, the vine is truncated at the tree Tj. 

At this point, it is important to note that the algorithm presented in (Salinas-Gutierrez et al. 
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2010) also uses a D-vine. In this algorithm only normal copulas are used in the first two trees 
and conditional independence is assumed for the rest of the trees, i.e., the D-vine is always 
truncated at the second tree. 

The implementation of CVEDA and DVEDA included in copulaedas uses the truncation 
procedure based on AIC and the candidate copulas normal, t, Clayton, Frank and Gumbel. 
The parameters of all copulas but the t copula are estimated using the inversion of Kendall's 
tau. For the t copula, the correlation coefficient is computed as in the normal copula and 
the degrees of freedom are estimated by maximum likelihood with the correlation parameter 
fixed (Demarta and McNeil 2005). 

Gonzalez-Fernandez (2011) presents a study about intrinsic characteristics of CVEDA and 
DVEDA, such as the impact of the truncation procedure and the effect of the selection of 
the structure of C-vines and D-vines. The implementation of these algorithms provided 
by copulaedas also has been used in the solution of the molecular docking problem with 
satisfactory results (Soto et al. 2012). 

3. Implementation in R 

The implementation of copulaedas follows an object-oriented design inspired by the Mateda- 
2.0 (Santana et al. 2010) toolbox for MATLAB. Each EDA implemented in the package is 
represented by an S4 class (Chambers 2008) with generic functions for its main steps. 

The main class of the package is EDA. This class is the base class of all classes implementing 
EDAs in the package. The EDA class has two slots: name and parameters. The name slot 
stores the name of the EDA and it is used by the show generic function to print the name 
of the algorithm when invoked with an EDA instance as argument. The parameters slot has 
greater importance, since it keeps all the parameters of the EDA in a list. 

Each step of the general procedure of an EDA outlined in Algorithm 1 is represented in R 
by a generic function that expects an EDA instance as its first argument. Table 1 shows a 
description of these functions and its default methods. The help page of each generic function 
in the documentation of copulaedas contains information about its arguments, the return 
value, and the methods for each generic function already implemented in the package. 

Generic functions that implement the steps of an EDA look at the parameters slot of the EDA 
instance received as first argument for the values of the parameters that affect its behavior. 
Only named members of the list should be used and reasonable default values should be 
assumed when a certain component is missing. For example, the edaSeedUnif orm method for 
the edaSeed generic function consults the popSize member of the list to know the number of 
solutions to be generated for the initial population. If the popSize component is not defined, 
an initial population of 100 solutions is generated. The help page of each generic function 
describes the members of the list in the parameters slot interpreted by each function and its 
default values. 

The edaRun function implements the Algorithm 1 by linking together the generic functions for 
each step. This function expects four arguments: the EDA instance, the objective function and 
two vectors specifying the lower and upper bounds of the variables of the objective function. 
The length of the vectors with the lower and upper bounds should be the same, since it 
determines the number of variables of the objective function. When edaRun is called, it runs 
the main loop of the EDA until the call to the edaTerminate generic function returns TRUE. 
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Generic function 


Description 


edaSeed 


Seeding method. The default method edaSeedUnif orm generates 




the values of each variable in the initial population from a con- 




tinuous uniform distribution. 


edaOptimize 


Local optimization method. The use of a local optimization 




method is disabled by default. 


edaSelect 


Selection method. The default method edaSelectTruncation 




implements truncation selection. 


edaLearn 


Learning method. No default method. 


edaSample 


Sampling method. No default method. 


edaReplace 


Replacement method. The default method edaReplaceComplete 




completely replaces the current population with the new popu- 




lation. 


edaReport 


Reporting method. Reporting progress information is disabled by 




default. 


edaTerminate 


Termination method. The default method edaTerminateMaxGen 




ends the execution of the algorithm after a maximum number of 




generations. 



Table 1: Description of the generic functions that implement the steps of the general procedure 
of an EDA outlined in Algorithm 1 and its default methods. 



Then, the function returns an instance of the EDAResult class that encapsulates the results 
of the algorithm. A description of the slots of this class is shown in Table 2. 

Two subclasses of EDA are defined in the package: CEDA, that represents EDAs based on 
multivariate copulas; and VEDA, that represents EDAs based on vines. The implementation of 
UMDA, GCEDA, CVEDA and DVEDA strongly relies on the copula (Kojadinovic and Yan 
2010) and vines (Gonzalez-Fernandez and Soto 2011b) R packages. These packages implement 
the algorithms for the estimation and simulation of the probabilistic models used by these 
EDAs. 



Slot 


Description 


eda 


EDA instance. 


f 


Objective function. 


lower 


Lower bounds of the variables of the objective function. 


upper 


Upper bounds of the variables of the objective function. 


numGens 


Total number of generations. 


fEvals 


Total number of evaluations of the objective function. 


bestSol 


Best solution. 


bestEval 


Evaluation of the best solution. 


cpuTime 


Run time of the algorithm in seconds. 



Table 2: Description of the slots of the EDAResult class. 
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4. Using copulaedas 

In this section, we illustrate how to use copulaedas. The first group of examples show how to 
run the EDAs implemented in the package. Next, we explain how to implement a new EDA 
based on copulas by using the functionalities provided by the package. Finally, we show how 
to perform an empirical study to compare a group of EDAs. 

Six well-known test functions are used in the examples. The objective functions are Sphere, 
Griewank, Ackley, Summation Cancellation, Rastrigin and Rosenbrock. The corresponding 
R functions are f Sphere, fGriewank, f Ackley, f SummationCancellation, fRastrigin and 
f Rosenbrock. The definition of each function for a vector x = (xi, . . . , x n ) is given below. 

n 

„2 



/Sphere (x) = £\ 
i=l 

n 2 n / \ 

/Gnewan k (-) = l+E i ^-ncOs(^j 



/Ackiey(aO = -20 exp -0.2, 



1A „ l n 



x2 J - eX P ( _ COS ( 27FX i) J + 20 + ex p ( x ) 



n . 



/Summation Cancellation {x) i n— 5 _i_ Y~*" I 1)2/1 *^1) Vi Vi—X ~\~ ■^i 

iu + 2^1=1 \ Vi\ 

n 

/Ra S trigin(«) = { X i ~ 10cOS ( 27r2; j) + 10 ) 
i=l 

n—l , „ v 

/Rosenbrock (x) = ^ ( 100 \X i+ l - xfj + (1 - Xif J 
i=l ^ ' 

Sphere, Griewank, Ackley, Rastrigin and Rosenbrock are minimization problems. Summa- 
tion Cancellation is originally a maximization problem but it is implemented in the pack- 
age as a minimization problem. Sphere, Griewank, Ackley and Rastrigin have their global 
optimum at x = (0, . . . , 0) with evaluation zero, Summation Cancellation has its global op- 
timum at x = (0,... ,0) with evaluation — 10 5 and Rosenbrock has its global optimum at 
x = (1, . . . , 1) with evaluation zero. For a description of the characteristics of these functions 
see (Bengoetxea et ol. 2002; Bosnian and Thierens 2006; Chen and Lim 2008). 

The results presented in this section were obtained using R version 2.13.0 with copulaedas 
version 1.1.0, copula version 0.9-7 and vines version 1.0.3. Computations were performed on 
an Intel(R) Pentium(R) D CPU 3.00 GHz processor. 

In the rest of this section, we assume copulaedas and the packages it depends on have been 
loaded. This can be attained by running the following command: 

R> library ("copulaedas") 



4.1. Running the EDAs implemented in the package 

We begin by illustrating how to run the EDAs based on copulas implemented in copulaedas. 
As an example, we execute GCEDA to optimize Sphere in five dimensions. 
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GCEDA is represented in the package by the CEDA class for EDAs based on multivariate 
copulas. Before creating a new instance of the class, we set up the generic functions for the 
steps of the EDA according to the expected behavior of the algorithm. 

The termination criterion is either to find the optimum of the objective function or to reach 
a maximum number of generations. That is why we set the method for the edaTerminate 
generic function to a combination of the functions edaTerminateEval and edaTerminateMaxGen 
through the auxiliary function edaTerminateCombined. 

fi> setMethod (" edaTerminate " , "EDA", 

+ edaTerminateCombined (edaTerminateEval , edaTerminateMaxGen)) 

The method for the edaReport generic function is set to edaReportSimple to make the 
algorithm print progress information at each generation. This function prints one line at each 
generation of the EDA with the minimum, mean and standard deviation of the evaluation of 
the solutions in the current population. 

fi> setMethod ("edaReport", "EDA", edaReportSimple) 

Note that these methods were set for the base class EDA and therefore they will be inherited by 
all subclasses. Generally, we find convenient to define methods of the generic functions that 
implement the steps of the EDA for the base class, except when different subclasses should 
use different methods. 

The auxiliary function CEDA can be used to create instances of the class with the same name. 
Arguments of this function are interpreted as parameters of the EDA to be added as members 
of the list in the parameters slot of the new instance. An instance of CEDA corresponding to 
GCEDA using empirical marginal distributions smoothed with normal kernels can be created 
as follows: 

fi> gceda <- CEDA(copula = "normal" , margin = "kernel" , 

+ popSize = 200, fEval = 0, fEvalTol = le-6, maxGen = 50) 

R> gceda@name <- "Gaussian Copula Estimation of Distribution Algorithm" 

The methods that implement the generic functions edaLearn and edaSample for CEDA in- 
stances expect three parameters. The copula parameter specifies the multivariate copula, it 
should be set to "normal" for GCEDA. The marginal distributions are determined by the 
value of margin. All EDAs implemented in the package use this parameter for the same 
purpose. As margin is set to "kernel", the algorithm will look for three functions named 
fkernel, pkernel and qkernel already defined in the package to fit the parameters of the 
margins and to evaluate the distribution and quantile functions, respectively. The fkernel 
function computes the bandwidth parameter of the normal kernel according to the rule-of- 
thumb of Silverman (1986) and pkernel implements the empirical cumulative distribution 
function. The quantile function is evaluated following the procedure described in (Azzalini 
1981). The popSize parameter determines the population size while the rest of the arguments 
of CEDA are parameters of the functions that implement the termination criterion. 

Now, we can run GCEDA by calling edaRun. The lower and upper bounds of the variables 
are set so that the values of the variables in the optimum of the function are located at 25% 
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of the interval. It was shown in (Arderi 2007) that the use of empirical marginal distributions 
smoothed with normal kernels improves the behavior of GCEDA when the initial population 
is generated asymmetrically with respect to the optimum of the function. 

R> result <- edaRun(gceda, f Sphere, rep (-300, 5), rep (900, 5)) 



Generati 



on 




Minimum 




Mean 




Std. Dev. 


1 


9 


,482661e+04 


1 


,007874e+06 


5 


,271801e+05 


2 


2 


,505026e+04 


4 


,479002e+05 


2 


.681643e+05 


3 


2 


,937460e+04 


2 


,091392e+05 


1 


,264960e+05 


4 


5 


,581990e+03 


1 


,017533e+05 


5 


.642366e+04 


5 


3 


,718832e+03 


4 


,862920e+04 


2 


.923179e+04 


6 


7 


, 132640e+02 


2 


,283238e+04 


1 


,306885e+04 


7 


1 


.381820e+03 


1 


,046310e+04 


5 


,386455e+03 


8 


3 


,068758e+02 


4 


, 943787e+03 


2 


. 744929e+03 


9 


1 


,910194e+02 


2 


, 188175e+03 


1 


.324063e+03 


10 


5 


,062607e+01 


9 


,492492e+02 


5 


,924008e+02 


11 


1 


.812797e+01 


3 


,922013e+02 


2 


,294571e+02 


12 


8 


. 197173e+00 


1 


,755258e+02 


1 


,023085e+02 


13 


6 


.354410e+00 


8 


, 173253e+01 


4 


.020714e+01 


14 


3 


.244178e+00 


4 


,448070e+01 


2 


,528163e+01 


15 


8 


,257841e-01 


i. 


,949224e+01 


1 


,085053e+01 


16 


7 


. 161607e-01 


9 


,814025e+00 


5 


,889123e+00 


17 


4 


.792109e-01 


4 


,450473e+00 


2 


,243748e+00 


18 


2 


,711282e-01 


2 


,365315e+00 


1 


,299843e+00 


19 


1 


,060200e-01 


1. 


, 121201e+00 


6 


,513729e-01 


20 


3 


.438796e-02 


5 


,884943e-01 


3 


,215504e-01 


21 


3 


353496e-02 


2 


756226e-01 


1 


773181e-01 


22 


1 


,524750e-03 


1. 


,117457e-01 


5 


.788006e-02 


23 


6 


,755015e-03 


6 


,367963e-02 


3 


,382219e-02 


24 


2 


.900885e-03 


3 


,517951e-02 


1 


,744429e-02 


25 


2 


.286571e-04 


i. 


,906427e-02 


1 


,041386e-02 


26 


8 


.635889e-04 


9 


,387225e-03 


4 


,927262e-03 


27 


5 


,090779e-04 


4 


, 185834e-03 


2 


,275273e-03 


28 


7 


,411229e-05 


1. 


,810591e-03 


1 


.063558e-03 


29 


6 


. 164902e-05 


8 


,022974e-04 


4 


,755964e-04 


30 


2 


.663512e-05 


3 


,368331e-04 


1 


,955830e-04 


31 


2 


,466309e-05 


1 


,657867e-04 


9 


,601912e-05 


32 


4 


,888452e-06 


7 


, 180790e-05 


3 


.945560e-05 


33 


3 


.626218e-06 


3 


,662413e-05 


2 


,063811e-05 


34 


1 


.424262e-06 


i. 


,737351e-05 


9 


,561411e-06 


35 


8 


,645225e-07 


8 


,936281e-06 


5 


,409103e-06 



The result variable contains an instance of the EDAResult class. A method for the show 
generic function prints the results of the execution of the algorithm. 

R> show(result) 
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Results for Gaussian Copula Estimation of Distribution Algorithm 

Best function evaluation 8.645225e-07 

No. of generations 35 

No. of function evaluations 7000 

CPU time 10.55 seconds 

Due to the stochastic nature of EDAs, it is often useful to analyze a sequence of indepen- 
dent runs of these algorithms to ensure reliable results. The edalndepRuns function sup- 
ports executing independent runs of an EDA. To avoid generating lot of unnecessary output, 
we first disable reporting progress information on each generation by setting edaReport to 
edaReportDisabled. 

fi> setMethod (" edaReport " , "EDA", edaReportDisabled) 

Now we can invoke the edalndepRuns function to perform 30 independent runs of GCEDA. 

R> results <- edalndepRuns (gceda, f Sphere, rep(-300 , 5), rep(900, 5), 30) 

The return value of the edalndepRuns function is an instance of the EDAResults class. This 
class is simply a wrapper for a list with instances of EDAResult as members. Each member 
stores the results of an execution of the EDA. A show method for EDAResults instances prints 
a table with the results of the runs of the EDA. 

R> show (results) 





Generations 


Evaluations 


Best 


Evaluation 


CPU Time 


Run 


1 


37 


7400 


7. 


,829213e-07 


11.02 


Run 


2 


37 


7400 


1. 


,927911e-07 


10.99 


Run 


3 


38 


7600 


4 


,835526e-07 


11.35 


Run 


4 


35 


7000 


8 


,577429e-07 


10.43 


Run 


5 


35 


7000 


9 


,451111e-07 


10.40 


Run 


6 


35 


7000 


9 


,302367e-07 


10.40 


Run 


7 


36 


7200 


9 


,909941e-07 


10.74 


Run 


8 


37 


7400 


4 


,724555e-07 


11.02 


Run 


9 


35 


7000 


5 


,222054e-07 


10.32 


Run 


10 


36 


7200 


4 


,951474e-07 


10.69 


Run 


11 


37 


7400 


8 


,988045e-07 


11.03 


Run 


12 


34 


6800 


9 


,875791e-07 


10.07 


Run 


13 


36 


7200 


9 


, 133316e-07 


10.69 


Run 


14 


38 


7600 


5 


,486231e-07 


11.34 


Run 


15 


34 


6800 


2 


,389598e-07 


10.18 


Run 


16 


35 


7000 


9 


,484775e-07 


10.35 


Run 


17 


31 


6200 


6 


,356923e-07 


9.22 


Run 


18 


36 


7200 


7 


,726292e-07 


10.80 


Run 


19 


35 


7000 


8 


,990815e-07 


10.31 


Run 


20 


34 


6800 


7 


,237225e-07 


10.13 
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Run 


21 


36 


7200 


6 


,837205e-07 


10. 


,74 


Run 


22 


35 


7000 


6 


, 937608e-07 


10, 


,38 


Run 


23 


34 


6800 


8 


,049177e-07 


10. 


,13 


Run 


24 


34 


6800 


9 


,615774e-07 


10. 


,14 


Run 


25 


39 


7800 


2 


,666713e-07 


11. 


,56 


Run 


26 


35 


7000 


6 


,968256e-07 


10. 


,43 


Run 


27 


36 


7200 


5 


, 148476e-07 


10. 


,74 


Run 


28 


36 


7200 


6 


,639083e-07 


10. 


,69 


Run 


29 


35 


7000 


7 


,000933e-07 


10. 


,39 


Run 


30 


37 


7400 


1 


,578787e-07 


11. 


,04 



Also, the summary function can be used to generate a table with a statistical summary of the 
results of the 30 runs of the algorithm. 



R> summary (results) 



Generations Evaluations Best Evaluation CPU Time 
Minimum 31.000000 6200.0000 1.578787e-07 9.22000 

Median 35.500000 7100.0000 6.984595e-07 10.56000 

Maximum 39.000000 7800.0000 9.909941e-07 11.56000 

Mean 35.600000 7120.0000 6 . 794754e-07 10.59067 

Std. Dev. 1.566899 313.3798 2.462059e-07 0.47365 



4.2. Implementation of a new EDA based on copulas 

Now, we illustrate how to implement a new EDA based on copulas by using copulaedas. Since 
the algorithm in question matches the general procedure of an EDA presented in Algorithm 1, 
only the functions corresponding to the learning and simulation steps have to be implemented. 
The main loop and the rest of the steps of the EDA are already implemented in the package. 

As an example, we implement the extension of MIMIC for continuous domains proposed in 
(Salinas-Gutierrez et al. 2009). Similar to MIMIC, this extension learns a chain dependence 
structure but it uses bivariate copulas instead of bivariate normal distributions to model 
dependences. 

Two instances of the extension of MIMIC based on copulas were presented in (Salinas-Gutierrez et al. 
2009), one uses bivariate normal copulas while the other uses bivariate Frank copulas. In this 
article, the algorithm will be denoted as Copula MIMIC. 

The first step in the implementation of a new EDA using copulaedas is to define an S4 class 
to represent the algorithm. The new class should inherit from EDA. For convenience, we also 
define an auxiliary function CopulaMIMIC that can be used to create new instances of the 
class with the same name. 

fi> setClass ( "CopulaMIMIC" , contains = "EDA", 

+ prototype = prototype (name = "Copula MIMIC")) 

R> CopulaMIMIC <- function (...) { 

+ new (" CopulaMIMIC" , parameters = list( . . . ) ) 

+ } 
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Copula MIMIC models the marginal distributions with the beta distribution. A linear trans- 
formation is used to map the sample of the variables in the selected population into the 
(0, 1) interval to match the domain of definition of the beta distribution. Note that this 
transformation does not affect the dependence between the variables because the copula is 
scale- invariant . 

To be consistent with the marginal distributions already implemented in copulaedas, we define 
three functions with the common suffix betamargin and the prefixes f , p and q to fit the 
parameters of the margins and for the evaluation of the distribution and quantile functions, 
respectively. By following this convention, the algorithms already implemented in the package 
can use beta marginal distributions by setting the margin parameter to "betamargin". The 
betamargin suffix was selected to avoid the redefinition of the pbeta and qbeta functions of 
the stats package. 

fi> fbetamargin <- function (x, lower, upper) { 
+ x <- (x - lower) / (upper - lower) 

+ loglik <- function (s) sum(dbeta(x, s[l], s[2], log = TRUE)) 

+ s <- optim(c(l , 1) , loglik, control = list(fnscale = -l))$par 

+ list (lower = lower, upper = upper, a = s[l], b = s[2]) 

+ } 

R> pbetamargin <- function (q, lower, upper, a, b) { 
+ q <- (q - lower) / (upper - lower) 

+ pbeta(q, a, b) 

+ } 

R> qbetamargin <- function (p, lower, upper, a, b) { 

+ q <- qbeta (p, a, b) 

+ lower + q * (upper - lower) 

+ } 

The CopulaMIMIC class inherits methods for the generic functions that implement all the steps 
of the EDA except learning and sampling. To complete the implementation of the algorithm, 
we implement the estimation and simulation of the probabilistic model as methods for the 
generic functions edaLearn and edaSample, respectively. 

The method for edaLearn starts with the estimation of the parameters of the marginal dis- 
tributions and the transformation of the selected population to uniform variables in (0, 1) by 
the evaluation of the marginal cumulative distribution functions. Then, the mutual informa- 
tion between all pairs of variables is calculated through the copula entropy (Davy and Doucet 
2003). To accomplish this, the parameters of each possible bivariate copula should be esti- 
mated. The parameters of the copulas are estimated by the method of maximum likelihood. 
The value of the parameter obtained by the inversion of Kendall's tau is used as an initial 
approximation. 

To determine the chain dependence structure learned by the algorithm, a permutation of 
the variables that maximizes the pairwise mutual information is selected. Because this is 
a computationally intensive task, a greedy algorithm is used to compute an approximate 
solution (De Bonet et al. 1997; Larranaga et al. 1999). 

Finally, the method for edaLearn returns a list with three components that represents the 
probabilistic model learned in the generation: the parameters of the marginal distributions, 
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the permutation of the variables and the copulas in the chain dependence structure. 
fl> edaLearnCopulaMIMIC <- function (eda, gen, previousModel , 



+ selectedPop, selectedEval , lower, upper) { 

+ margin <- eda@parameters$margin 

+ copula <- eda@parameters$copula 
+ 

+ if (is .null (margin) ) margin <- "betamargin" 

+ if (is .null (copula) ) copula <- "normal" 
+ 

+ f margin <- get(paste("f", margin, sep = "")) 

+ pmargin <- get (paste ( "p" , margin, sep = "")) 

+ copula <- switch(copula, 

+ normal = normalCopula(O) , frank = frankCopula(O) ) 
+ 

+ n <- ncol (selectedPop) 
+ 

+ # Estimate the parameters of the marginal distributions . 

+ margins <- lapply (seqdength = n) , 

+ function (i) f mar gin (selectedPop [ , i] , lower [i] , upper [i])) 

+ uniformPop <- sapply (seqdength = n) , 

+ function (i) do . call (pmargin, 

+ c(list (selectedPop [ , i] ) , margins [[i]]))) 
+ 

+ # Calculate pairwise mutual information by using copula entropy. 

+ C <- matrix (list (NULL) , nrow = n, ncol = n) 

+ I <- matrix (0 , nrow = n, ncol = n) 

+ for (i in seq(from =2, to = n)) { 

+ for (j in seq(from = 1, to = i - 1)) { 

+ # Estimate the parameters of the copula. 

+ data <- cbind(uniformPop[ , i] , uniformPop [ , j]) 

+ startCopula <- fitCopula(copula, data, method = "itau", 

+ estimate . variance = FALSE) @copula 

+ C[[i, j]] <- tryCatch( 

+ fitCopula(startCopula, data, method = "ml", 

+ start = startCopulaQparameters , 

+ estimate .variance = FALSE)® copula, 

+ error = function (error) startCopula) 

+ # Calculate mutual information. 

+ if (is(C[[i, j]], "normalCopula")) { 

+ I[i, j] < 0.5 * log(l - C[[i, j]] ©parameters ~2) 

+ } else { 

+ u <- rcopula(C[[i, j]] , 100) 

+ I[i, j] <- sum(log(dcopula(C[[i, j]] , u))) / 100 

+ } 

+ C[[j, i]] <- Clli, jll; I[j, i] <- Hi, j] 

+ } 
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+ } 
+ 

+ # Pick a permutation of the variables . 

+ perm <- as . vector (arraylnd (which. max (I) , dim(I))) 

+ copulas <- C[perm[l] , perm [2]] 

+ I [perm, ] <- -Inf 

+ for (k in seq(length = n - 2)) { 

+ ik <- which. max (I [ , perm[l]] ) 

+ perm <- c(ik, perm) 

+ copulas <- c(C[perm[l] , perm [2]] , copulas) 

+ I[ik, ] <- -Inf 

+ } 
+ 

+ list (margins = margins, perm = perm, copulas = copulas) 



+ } 

R> setMethod("edaLearn" , "CopulaMIMIC" , edaLearnCopulaMIMIC) 

The method for the edaSample generic function receives the representation of the probabilistic 
model returned by edaLearn as the model argument. The generation of a new solution with 
n variables starts with the simulation of an n-dimensional vector U having uniform marginal 
distributions in (0, 1) and the dependence described by the copulas in the chain dependence 
structure. 

The first step is to simulate an independent uniform variable U nn in (0, 1), where ir n denotes 
the variable in the position n of the permutation n selected by the edaLearn method. The 
rest of the uniform variables are simulated conditionally on the previously simulated variable 
by using the conditional copula C(C/ 7 r fe |C/ 7rfe+1 ), with k = n — 1, n — 2, . . . , 1. 

Finally, the new solution is determined through the evaluation of the beta quantile functions 
and the application of the inverse of the linear transformation. This procedure is repeated 
for each solution to be generated. 

R> edaSampleCopulaMIMIC <- function (eda, gen, model, lower, upper) { 



+ popSize <- eda<Sparameters$popSize 

+ margin <- eda@parameters$margin 
+ 

+ if (is . null (popSize) ) popSize <- 100 

+ if (is .null (margin) ) margin <- "betamargin" 

+ 

+ qmargin <- get (paste ( "q" , margin, sep = "")) 
+ 

+ n <- length (model$margins) 

+ perm <- model$perm 

+ copulas <- model$copulas 

+ 

+ # Simulate the chain structure with the copulas. 

+ uniformPop <- matrix (0 , nrow = popSize, ncol = n) 

+ uniformPop [ , perm[n]] <- runif (popSize) 
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+ for (k in seq(from = n - 1, to = 1)) { 

+ u <- runif (popSize) 

+ v <- uniformPop [ , perm[k + 1]] 

+ uniformPop [ , perm[k]] <- hinverse (copulas [[k]] , u, v) 

+ } 
+ 

+ # Evaluate the inverse of the marginal distributions . 

+ pop <- sapply(seq(length = n) , 
+ function (i) do . call (qmargin, 

+ c (list (uniformPop [ , i]), model$margins [[i]] ))) 

+ 

+ pop 



+ } 

R> setMethodC'edaSample" , "CopulaMIMIC" , edaSampleCopulaMIMIC) 

The implementation of Copula MIMIC is now complete. As it was illustrated with GCEDA in 
the previous section, the algorithm can be executed by creating an instance of the CopulaMIMIC 
class and calling the edaRun function. 

4.3. Performing an empirical study 

Finally, we show how to use copulaedas to perform an empirical study of the behavior of a 
group of EDAs based on copulas on benchmark problems. The algorithms to be compared 
are UMDA, GCEDA, CVEDA, DVEDA and Copula MIMIC. The first three algorithms are 
included in copulaedas and the fourth algorithm was implemented in Section 4.2. All func- 
tions described at the beginning of Section 4 are considered as benchmark problems in 10 
dimensions. 

The aim of this empirical study is to assess the behavior of these algorithms when only linear 
and independence relationships are considered. Thus, only normal and product copulas are 
used in these EDAs. UMDA and GCEDA use multivariate product and normal copulas, 
respectively. CVEDA and DVEDA are configured to combine bivariate product and normal 
copulas in the vines. Copula MIMIC learns a chain dependence structure with normal copulas. 
All algorithms use normal marginal distributions. Note that in this case, GCEDA corresponds 
to EMNA and Copula MIMIC is similar to MIMIC for continuous domains. 

In the following code fragment, we create class instances corresponding to the algorithms 
described in the previous paragraph. 

fi> umda <- CEDA (copula = "indep", margin = "norm") 
R> umdaQname <- "UMDA" 

R> gceda <- CEDA (copula = "normal" , margin = "norm") 
R> gcedaQname <- "GCEDA" 

R> cveda <- VEDA (vine = "CVine" , indepTestSigLevel = 0.01, 
+ copulas = c ("normal" ) , margin = "norm") 

R> cvedaQname <- "CVEDA" 

R> dveda <- VEDA (vine = "DVine" , indepTestSigLevel = 0.01, 
+ copulas = c ("normal" ) , margin = "norm") 

R> dvedaOname <- "DVEDA" 
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R> copulamimic <- CopulaMIMIC (copula = "normal" , margin = "norm") 
R> copulamimicQname <- "CopulaMIMIC" 

The initial population is generated using the default method of the edaSeed generic function, 
therefore, it is sampled uniformly in the real interval of each variable. The lower and upper 
bounds of the variables are set so that the values of the variables in the optimum of the 
function are located in the middle of the interval. We use the intervals [—600, 600] in Sphere 
and Griewank, [—30,30] in Ackley, [—0.16,0.16] in Summation Cancellation, [—5.12,5.12] in 
Rastrigin, and [—9,11] in Rosenbrock. 

All algorithms use the default truncation selection method with a truncation factor of 0.3. 
Three termination criteria are combined using the edaTerminateCombined function: to find 
the global optimum of the function with a precision greater than 10" 6 , to reach 300000 
function evaluations or to loose diversity in the population, i.e., the standard deviation of the 
evaluation of the solutions in the population is less than 10~ 8 . These criteria are implemented 
in the functions edaTerminateEval, edaTerminateMaxEvals and edaTerminateEvalStdDev, 
respectively. 

The population size of EDAs along with the truncation method determine the sample available 
for the estimation of the search distribution. An arbitrary selection of the population size 
could lead to misleading conclusions of the results of the experiments. When the population 
size is too small, the search distributions might not be accurately estimated. On the other 
hand, the use of an excessively large population size usually does not result in a better behavior 
of the algorithms but certainly in a greater number of function evaluations. 

We advocate the use of the critical population size when comparing the performance of EDAs. 
The critical population size is the minimum population size required by the algorithm to find 
the global optimum of the function with a high success rate. To find the optimum of the 
function in 30 of 30 sequential independent runs can be generally considered a high success 
rate. 

An approximate value of the critical population size can be determined empirically using a 
bisection method (see e.g., Pelikan 2005 for a pseudocode of the algorithm). The bisection 
method begins with an initial interval where the critical population size should be located 
and discards one half of the interval at each step. This procedure is implemented in the 
edaCriticalPopSize function. In the experimental study carried out in this section, the 
initial interval is [50,2000]. If the critical population size is not found at this interval, the 
results of the algorithm with the population size determined by the upper bound are presented. 

The complete empirical study consists of performing 30 independent runs of every algorithm 
on every function using the critical population size. We proceed with the definition of a list 
containing all algorithm-function pairs. 

R> edas <- list (umda, gceda, cveda, dveda, copulamimic) 

R> fNames <- c("Sphere" , "Griewank" , "Ackley" , "SummationCancellation" , 

+ "Rastrigin", "Rosenbrock") 

R> experiments <- list() 

R> for (eda in edas) { 

+ for (fName in fNames) { 

+ experiment <- list (eda = eda, fName = fName) 

+ experiments <- c (experiments, list (experiment)) 
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+ } 
+ } 

Now we define a function to process the elements of the experiments list. This function 
implements all the experimental setup described before. The output of edaCriticalPopSize 
and edalndepRuns is redirected to a different plain-text file for each algorithm-function pair. 

R> runExperiment <- function (experiment) { 



+ eda <- experiment$eda 

+ fName <- experiment$fName 

+ 

+ # Information of the objective function. 

+ flnfo <- list( 

+ Sphere = list (lower = -600, upper = 600, fEval = 0) , 

+ Griewank = list (lower = -600, upper = 600, fEval = 0) , 

+ Ackley = list (lower = -30, upper = 30, fEval = 0) , 

+ SummationCancellation = list (lower = -0.16, upper = 0.16, 

+ fEval = -le5) , 

+ Rastrigin = list (lower = -5.12, upper = 5.12, fEval = 0) , 

+ Rosenbrock = list (lower = -9, upper = 11, fEval = 0) 

+ ) 

+ lower <- rep(f Info [[fName] ]$lower, 10) 

+ upper <- rep(f Info [[fName] ]$upper, 10) 

+ f <- get (paste ("f", fName, sep = "")) 
+ 

+ # Configure termination criteria. 

+ eda@parameters$ fEval <- f Info [[fName] ]$fEval 

+ eda@parameters$fEvalTol <- le-6 

+ eda@parameters$fEvalStdDev <- le-8 

+ eda@parameters$maxEvals <- 300000 

+ setMethod("edaTerminate" , "EDA", 

+ edaTerminateCombined(edaTerminateEval , edaTerminateMaxEvals , 

+ edaTerminateEvalStdDev) ) 

+ 

+ sink (paste (edaQname , fName, ".txt", sep = "")) 

+ # Determine the critical population size. 

+ results <- edaCriticalPopSize (eda, f, lower, upper, 

+ eda@parameters$fEval, eda@parameters$fEvalTol, lowerPop = 50, 

+ upperPop = 2000, totalRuns = 30, successRuns = 30, 

+ stopPercent = 10, verbose = TRUE) 

+ if (is .null (results) ) { 

+ # Run the experiment with the largest population size, if the 

+ # critical population size was not determined. 

+ eda@parameters$popSize <- 2000 

+ edalndepRuns (eda, f, lower, upper, runs = 30, verbose = TRUE) 

+ } 
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+ sink (NULL) 

+ } 

We can run all the experiments by calling runExperiment for each element of the list. 

R> for (experiment in experiments) { 
+ runExperiment (experiment) 

+ } 

Running the complete empirical study is a computationally demanding operation. If various 
processing units are available, it can be speeded up significantly by running the experiments 
in parallel. The snow package (Tierney et al. 2011) offers a great platform to achieve this 
purpose, since it provides a high-level interface for using a cluster of workstations for parallel 
computations in R. The functions clusterApply or clusterApplyLB can be used to call 
runExperiment for each element of the experiments list in parallel, with minor modifications 
to the code presented here. 

A summary of the results of the algorithms in Sphere, Griewank, Ackley, Summation Can- 
cellation, Rastrigin and Rosenbrock with the critical population size is shown in Tables 3, 4, 
5, 6, 7 and 8, respectively. We conclude this section with an overview of the results, since a 
detailed analysis of the behavior of each algorithm is out of the scope of this paper. 

All algorithms are able to find the global optimum of Sphere, Griewank, Ackley and Rast- 
rigin and in the 30 independent runs with similar function values. Only GCEDA, CVEDA 
and DVEDA optimize Summation Cancellation and no algorithm is capable of optimizing 
Rosenbrock. 

UMDA exhibits the best behavior in terms of the number of function evaluations in Sphere 
Griewank and Ackley. There are no strong dependences between the variables of these func- 
tions and it seems that the marginal information is enough to find the global optimum ef- 
ficiently. The other algorithms require the calculation of a greater number of parameters 
to represent the relationships between the variables, hence larger populations are needed to 
compute them reliably. The requirement of larger population sizes results in a greater number 
of function evaluations. CVEDA and DVEDA do not assume a normal dependence structure 
between the variables and for this reason are less affected by this issue. The estimation pro- 
cedure used by the vine-based algorithms selects the product copula if there is no enough 
evidence of dependence. This technique allows CVEDA and DVEDA to perform similarly to 
UMDA in these problems. 

Both UMDA and Copula MIMIC fail to optimize Summation Cancellation. A correct rep- 
resentation of the strong linear interactions between the variables of this function appears 
to be essential to find the global optimum. UMDA completely ignores this information by 
assuming independence between the variables and it exhibits the worst behavior. Copula 
MIMIC reaches better fitness values than UMDA but neither can find the optimum of the 
function. The probabilistic model estimated by Copula MIMIC can not represent important 
dependences for the success of the optimization. 

GCEDA, CVEDA and DVEDA do find the global optimum of Summation Cancellation. 
The correlation matrix estimated by GCEDA represents accurately the multivariate linear 
interactions between the variables. Thus, GCEDA has the best behavior in Summation 
Cancellation in terms of the number of function evaluations. The results of CVEDA closely 
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Algorithm 


Success 


Population 


Evaluations 


Best evaluation 


CPU time 


UMDA 


30/30 


81 


3823.2 


6.9e - 07 


0.4 








±128.3 


±2.3e - 07 


±0.0 


GCEDA 


30/30 


310 


13082.0 


6.5e - 07 


0.9 








±221.4 


±2.0e - 07 


±0.0 


CVEDA 


30/30 


104 


4777.0 


6.7e - 07 


8.3 








±118.8 


±1.8e-07 


±1.5 


DVEDA 


30/30 


104 


4787.4 


6.7e - 07 


8.2 








±100.2 


±2.0e - 07 


±1.1 


Copula MIMIC 


30/30 


150 


6495.0 


6.9e - 07 


468.8 








±209.0 


±1.7e-07 


±62.9 



Table 3: Results of 30 independent runs of UMDA, GCEDA, CVEDA, DVEDA and Copula 
MIMIC in the 10-dimensional Sphere problem with the critical population size. 



Algorithm 


Success 


Population 


Evaluations 


Best evaluation 


CPU time 


UMDA 


30/30 


111 


5224.4 


6.6e 


-07 


0.5 








±231.2 


±1.9e 


-07 


±0.0 


GCEDA 


30/30 


355 


15099.3 


6.9e 


-07 


1.2 








±414.1 


±1.8e 


-07 


±0.0 


CVEDA 


30/30 


142 


6579.3 


7.0e 


-07 


9.3 








±389.9 


±1.8e 


-07 


±2.0 


DVEDA 


30/30 


150 


6785.0 


6.5e 


-07 


9.5 








±338.1 


±2.4e 


-07 


±1.9 


Copula MIMIC 


30/30 


188 


8221.8 


6.6e 


-07 


595.7 








±220.4 


±1.8e 


-07 


±91.6 



Table 4: Results of 30 independent runs of UMDA, GCEDA, CVEDA, DVEDA and Copula 
MIMIC in the 10-dimensional Griewank problem with the critical population size. 



follow the ones of GCEDA and the latter has much better results than DVEDA. A C-vine 
provides a more appropriate modeling of the dependence structure between the variables of 
Summation Cancellation than a D-vine, since it is possible to find a variable that governs the 
interactions in the sample (Gonzalez-Fernandez 2011). 

Rastrigin is not considered a problem where the interactions between the variables play an 
important role for the success of the optimization. It is often only used to assess the effect 
of multimodality. In spite of this, Rastrigin provides interesting results about dependence 
modeling in ED As. 

All the studied algorithms find the global optimum of Rastrigin, but not all require the 
same number of function evaluations. Neither assuming independence between all pairs of 
variables nor considering a multivariate linear dependence structure lead to the best results. 
The approach of DVEDA, that lies in the middle, performs better. DVEDA constructs a 
probabilistic model that uses bivariate normal copulas if the dependence is strong and product 
copulas in the other case. In Rastrigin, the combination of normal and product copulas in 
a single probabilistic model is better than assuming independence or a multivariate linear 
dependence structure. 
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Algorithm 


Success 


Population 


Evaluations 


Best evaluation 


CPU time 


UMDA 


30/30 


81 


4997.7 


8.1e - 07 


0.6 








±88.0 


±l.le — U7 


±0.0 


GCEDA 


30/30 


279 


15633.3 


8.3e - 07 


1.6 








±258.8 


±l.le-07 


±0.0 


CVEDA 


30/30 


104 


6330.1 


8.1e-07 


10.6 








±163.2 


±1.0e-07 


±1.8 


DVEDA 


30/30 


111 


6678.5 


7.9e - 07 


11.4 








±133.8 


±1.4e-07 


±1.6 


Copula MIMIC 


30/30 


188 


10784.9 


8.0e - 07 


800.1 








±143.7 


±1.2e-07 


±122.1 



Table 5: Results of 30 independent runs of UMDA, GCEDA, CVEDA, DVEDA and Copula 
MIMIC in the 10-dimensional Ackley problem with the critical population size. 



Algorithm 


Success 


Population 


Evaluations 


Best evaluation 


CPU time 


UMDA 


0/30 


2000 


300000.0 


-5.7e + 02 


66.8 








±0.0 


±3.4e + 02 


±0.7 


GCEDA 


30/30 


355 


42434.3 


-1.0e + 05 


9.3 








±305.4 


±1.3e-07 


±0.4 


CVEDA 


30/30 


325 


44622.5 


-1.0e + 05 


537.2 








±858.3 


±1.3e-07 


±6.6 


DVEDA 


30/30 


965 


117408.3 


-1.0e + 05 


2367.3 








±959.4 


±9.3e - 08 


±20.0 


Copula MIMIC 


0/30 


2000 


300000.0 


-2.3e + 04 


10426.1 








±0.0 


±2.7e + 04 


±1054.3 



Table 6: Results of 30 independent runs of UMDA, GCEDA, CVEDA, DVEDA and Copula 
MIMIC in the 10-dimensional Summation Cancellation problem with the critical population 
size. 



Algorithm 


Success 


Population 


Evaluations Best evaluation 


CPU time 


UMDA 


30/30 


447 


33614.4 


6.7e- 


-07 


1.7 








±2452.2 


±2.3e - 


-07 


±0.1 


GCEDA 


30/30 


721 


46095.9 


6.8e- 


-07 


2.8 








±2158.2 


±1.8e- 


-07 


±0.1 


CVEDA 


30/30 


447 


32914.1 


6.6e - 


-07 


44.8 








±2011.0 


±1.7e- 


-07 


±13.3 


DVEDA 


30/30 


325 


24710.8 


7.3e- 


-07 


31.3 








±1754.3 


±1.7e- 


-07 


±6.9 


Copula MIMIC 


30/30 


386 


27315.9 


6.4e - 


-07 


1431.7 








±1673.9 


±2.1e- 


-07 


±218.6 



Table 7: Results of 30 independent runs of UMDA, GCEDA, CVEDA, DVEDA and Copula 
MIMIC in the 10-dimensional Rastrigin problem with the critical population size. 
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Algorithm 


Success 


Population 


Evaluations 


Best evaluation 


CPU time 


UMDA 


0/30 


2000 


300000.0 


8.0e + 00 


72.0 








±0.0 


±2.6e - 02 


±0.1 


GCEDA 


0/30 


2000 


300000.0 


7.5e + 00 


74.3 








±0.0 


±1.9e-01 


±0.4 


CVEDA 


0/30 


2000 


193866.6 


7.5e + 00 


1278.9 








±48243.4 


±l.le-01 


±532.0 


DVEDA 


0/30 


2000 


172200.0 


7.5e + 00 


961.4 








±35183.6 


±1.5e-01 


±386.9 


Copula MIMIC 


0/30 


2000 


139000.0 


7.6e + 00 


6173.5 








±5139.4 


±1.3e-01 


±820.5 



Table 8: Results of 30 independent runs of UMDA, GCEDA, CVEDA, DVEDA and Copula 
MIMIC in the 10-dimensional Rosenbrock problem with the critical population size. 



The chain dependence structure learned by Copula MIMIC is similar to a D-vine that only 
uses normal copulas in the first tree. This is why, Copula MIMIC also combines normal and 
product copulas, and attains the second best results in Rastrigin according to the number of 
function evaluations. 

The results of CVEDA in Rastrigin are different than the ones of DVEDA. The model used by 
DVEDA allows a freer selection of the bivariate dependences that will be explicitly modeled, 
while the model used by CVEDA has a more restrictive structure. These characteristics 
prevent CVEDA from discovering the bivariate dependences DVEDA finds and its results are 
similar to the ones of UMDA. 

Rosenbrock is generally recognized as a difficult problem for numerical optimization algo- 
rithms. Its global optimum is located inside a parabolic shaped flat region. It seems to be 
easy for algorithms to find the flat region, but convergence to the global optimum is difficult. 
This function presents nonlinear and even nonmonotone dependences between the variables. 
The normal copula can not account for this type of dependence and therefore the relationships 
between the variables are not properly represented. However, the algorithms that consider 
dependences find slightly better solutions than UMDA. A more appropriate representation 
of the relationships between the variables of the function might improve the behavior of 
copula-based EDAs in this problem. 

The running time of Copula MIMIC is considerably greater than the running time of the 
other algorithms in all functions. This situation is due to the use of a numerical optimization 
algorithm for the estimation of the parameters of the copulas by maximum likelihood. In 
the context of EDAs, where copulas are fitted at every generation, the computational effort 
required to estimate the parameters of the copulas becomes an important issue. As was 
illustrated with the behavior of CVEDA and DVEDA, using the inversion of Kendall's tau is 
a viable alternative to maximum likelihood that requires much less CPU time. 

The empirical investigation confirms the robustness of CVEDA and DVEDA in both weakly 
strongly and correlated problems. Nonetheless, the flexibility afforded by these algorithms 
comes with an increased running time when compared to UMDA or GCEDA, since the inter- 
actions between the variables have to be discovered during the learning step. 

A general result of this empirical study is that copula-based EDAs should use copulas other 
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than the product only when there is evidence of dependence in the sample of the selected 
population. Otherwise, the EDA will require larger populations and hence a greater number 
of function evaluations to accurately determine the values of the parameters of the copulas 
that correspond to independence. 

5. Concluding remarks 

We have developed copulaedas aiming to provide in a single package not only publicly available 
implementations of EDAs based on copulas but also utility functions to study these algorithms. 
In this paper, we illustrate how to run the copula-based EDAs implemented in the package, 
how to implement new algorithms and how to perform an empirical study to compare a group 
of EDAs. We hope that these functionalities help the research community to improve EDAs 
based on copulas by getting a better insight of their strengths and weaknesses and also help 
practitioners to find new applications of these algorithms to real- world problems. 
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